Adaptive Abstraction for Model-Based Reinforcement Learning
Abstract
This paper presents a novel model-based reinforcement learning framework called the Adaptive Modelling and Planning System (AMPS). The challenge for a model-based reinforcement learning agent is to use its experience in the world to build a model. In problems with large state and action spaces, the agent must generalise from limited experience by grouping together similar states and actions, effectively partitioning the state and action spaces into finite sets of regions. Several abstraction approaches have been proposed in the literature, but existing algorithms have many limitations: they generally only increase resolution, require a large amount of data before changing the abstraction, do not generalise over actions, and are computationally expensive. AMPS addresses these problems by both splitting and merging existing regions in its abstraction according to a set of heuristics. The system introduces splits using a mechanism related to supervised learning; the mechanism is defined generally, allowing AMPS to leverage a wide variety of representations. The system merges existing regions when an analysis of the current plan indicates that doing so could be useful. Because several regions may require revision at any given time, AMPS prioritises revision to make the best use of whatever computational resources are available. Changes in the abstraction lead to changes in the model, which in turn require changes to the plan. AMPS therefore also prioritises the planning process, and when the agent has time, it replans in high-priority regions. This paper demonstrates the flexibility and strength of this approach in learning intelligent behaviour.
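The split, merge, and prioritised-revision cycle described in the abstract can be illustrated with a minimal sketch. This is not the AMPS algorithm: the abstract does not specify its heuristics, so everything here is an assumption made for illustration. The class `Abstraction1D`, the helpers `sse` and `revise`, the one-dimensional state space, the midpoint split, and the squared-error gain (a simple supervised-learning style criterion) are all hypothetical, and `merge` is exposed only as a primitive because the plan analysis that triggers merging is not described in the abstract.

```python
import bisect
import heapq


def sse(xs):
    """Sum of squared errors of xs around their mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


class Abstraction1D:
    """Hypothetical partition of the state interval [0, 1] into regions."""

    def __init__(self):
        self.cuts = [0.0, 1.0]   # region i covers [cuts[i], cuts[i + 1])
        self.samples = [[]]      # per-region list of (state, reward) pairs

    def region_of(self, s):
        # Clamp so that s == 1.0 falls into the last region.
        return min(bisect.bisect_right(self.cuts, s) - 1, len(self.samples) - 1)

    def observe(self, s, reward):
        self.samples[self.region_of(s)].append((s, reward))

    def _halves(self, i):
        mid = (self.cuts[i] + self.cuts[i + 1]) / 2
        left = [p for p in self.samples[i] if p[0] < mid]
        right = [p for p in self.samples[i] if p[0] >= mid]
        return mid, left, right

    def split_gain(self, i):
        """Assumed heuristic: error reduction from cutting region i in half."""
        _, left, right = self._halves(i)
        if not left or not right:
            return 0.0
        lr = [r for _, r in left]
        rr = [r for _, r in right]
        return sse(lr + rr) - sse(lr) - sse(rr)

    def split(self, i):
        mid, left, right = self._halves(i)
        self.cuts.insert(i + 1, mid)
        self.samples[i:i + 1] = [left, right]

    def merge(self, i):
        """Merge region i with its right-hand neighbour i + 1."""
        del self.cuts[i + 1]
        self.samples[i:i + 2] = [self.samples[i] + self.samples[i + 1]]


def revise(abstraction, budget, min_gain=0.05):
    """Apply at most `budget` splits, always taking the highest-gain region."""
    applied = 0
    for _ in range(budget):
        heap = [(-abstraction.split_gain(i), i)
                for i in range(len(abstraction.samples))]
        heapq.heapify(heap)
        neg_gain, i = heapq.heappop(heap)
        if -neg_gain < min_gain:
            break  # no remaining region is worth revising
        abstraction.split(i)
        applied += 1
    return applied


if __name__ == "__main__":
    abstraction = Abstraction1D()
    for k in range(10):
        s = k / 10 + 0.05
        abstraction.observe(s, 1.0 if s >= 0.5 else 0.0)  # reward steps at 0.5
    revise(abstraction, budget=5)
    print(abstraction.cuts)
```

In this toy setting the reward changes only at 0.5, so a single split removes all of the error and the remaining budget goes unused; the `budget` argument stands in for the limited computational resources that the abstract says revision must be prioritised against.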
Similar resources
Reinforcement Learning Based PID Control of Wind Energy Conversion Systems
In this paper, an adaptive PID controller for Wind Energy Conversion Systems (WECS) has been developed. The adaptation technique applied to this controller is based on Reinforcement Learning (RL) theory. Nonlinear characteristics of wind variations as plant input, wind turbine structure and generator operational behavior demand a high-quality adaptive controller to ensure both robust stability an...
Mini/Micro-Grid Adaptive Voltage and Frequency Stability Enhancement Using Q-learning Mechanism
This paper develops an adaptive method for controlling the frequency and voltage of an islanded mini/micro grid (M/µG) using reinforcement learning. Reinforcement learning (RL) is a branch of machine learning and the main solution method for Markov decision processes (MDPs). Among the several solution methods of RL, the Q-learning method is used for solving RL in th...
An Adaptive Learning Game for Autistic Children using Reinforcement Learning and Fuzzy Logic
This paper presents an adaptive serious game for rating social ability in children with autism spectrum disorder (ASD). The required measurements are obtained through the challenges of the proposed serious game. The game uses reinforcement learning concepts to be adaptive, and it is based on fuzzy logic to evaluate the social ability level of children with ASD. The game adapts itsel...
HIERARCHICAL REINFORCEMENT LEARNING WITH FUNCTION APPROXIMATION FOR ADAPTIVE CONTROL
By Margaret Mary Skelly. This dissertation investigates the incorporation of function approximation and hierarchy into reinforcement learning for use in an adaptive control setting through empirical studies. Reinforcement learning is an artificial intelligence technique whereby an agent discovers which actions lead to optimal task performance through interaction with its environment. Although re...
TD Models: Modeling the World at a Mixture of Time Scales
Temporal-difference (TD) learning can be used not just to predict rewards, as is commonly done in reinforcement learning, but also to predict states, i.e., to learn a model of the world's dynamics. We present theory and algorithms for intermixing TD models of the world at different levels of temporal abstraction within a single structure. Such multi-scale TD models can be used in model-based rein...